Skip to content

Conversation

@prozolic
Copy link
Contributor

@prozolic prozolic commented Oct 16, 2025

Summary

This PR improves the performance of FrozenDictionary.ToFrozenDictionary method.

Implementation Approach

When the source is already Dictionary<TKey, TValue> and satisfies the condition Dictionary.Count != 0 && !Dictionary.Comparer.Equals(comparer), the processing is changed to the following point.

  • Iterate directly over newDictionary instead of source to eliminate casting overhead

Benchmark

BenchmarkDotNet v0.14.1-nightly.20250107.205, Windows 11 (10.0.26100.6584)
11th Gen Intel Core i7-11700 2.50GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 10.0.100-rc.1.25451.107
  [Host]     : .NET 10.0.0 (10.0.25.45207), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-SADIXB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

OutlierMode=DontRemove  PowerPlanMode=00000000-0000-0000-0000-000000000000  Toolchain=CoreRun  
IterationTime=250ms  MaxIterationCount=20  MemoryRandomization=True  
MinIterationCount=15  WarmupCount=1  

| Method              | Size  | Mean          | Error         | StdDev        | Median        | Min           | Max           | Code Size | Gen0     | Gen1     | Gen2     | Allocated |
|-------------------- |------ |--------------:|--------------:|--------------:|--------------:|--------------:|--------------:|----------:|---------:|---------:|---------:|----------:|
| Original_Dictionary | 1     |      50.77 ns |      5.017 ns |      5.778 ns |      48.29 ns |      45.60 ns |      68.21 ns |   2,243 B |   0.0409 |        - |        - |     344 B |
| PR_Dictionary       | 1     |      36.30 ns |      1.882 ns |      2.167 ns |      35.54 ns |      34.84 ns |      42.11 ns |   1,892 B |   0.0354 |        - |        - |     296 B |
| Original_Dictionary | 10    |     270.57 ns |     17.663 ns |     20.341 ns |     264.05 ns |     250.11 ns |     317.83 ns |   3,179 B |   0.1185 |        - |        - |     992 B |
| PR_Dictionary       | 10    |     173.65 ns |      3.315 ns |      3.256 ns |     173.26 ns |     169.91 ns |     182.74 ns |   2,634 B |   0.1124 |        - |        - |     944 B |
| Original_Dictionary | 100   |   2,308.68 ns |    171.190 ns |    197.143 ns |   2,239.27 ns |   2,146.19 ns |   2,915.64 ns |   3,194 B |   0.9931 |   0.0170 |        - |    8328 B |
| PR_Dictionary       | 100   |   1,371.95 ns |     31.095 ns |     35.809 ns |   1,362.03 ns |   1,348.04 ns |   1,513.92 ns |   2,646 B |   0.9881 |   0.0165 |        - |    8280 B |
| Original_Dictionary | 1000  |  19,235.59 ns |    697.651 ns |    803.416 ns |  18,891.83 ns |  18,611.36 ns |  21,727.45 ns |   2,992 B |   9.6957 |   1.8646 |        - |   81304 B |
| PR_Dictionary       | 1000  |  13,123.78 ns |    558.138 ns |    642.753 ns |  12,917.55 ns |  12,450.91 ns |  14,648.06 ns |   2,981 B |   9.6845 |   1.3478 |        - |   81256 B |
| Original_Dictionary | 10000 | 257,255.96 ns | 26,300.558 ns | 30,287.762 ns | 246,032.88 ns | 224,345.99 ns | 362,544.12 ns |   2,993 B | 124.0672 | 124.0672 | 124.0672 |  753284 B |
| PR_Dictionary       | 10000 | 179,099.95 ns |  6,462.982 ns |  7,442.779 ns | 178,413.09 ns | 168,169.41 ns | 198,089.87 ns |   1,573 B | 124.3421 | 124.3421 | 124.3421 |  753236 B |
[BenchmarkCategory(Categories.Libraries, Categories.LINQ)]
public class Perf_FrozenDictionary
{
    private Dictionary<int, int> _dictionary;

    [Params(1, 10, 100, 1000, 10000)]
    public int Size;

    [GlobalSetup]
    public void GlobalSetup()
    {
        _dictionary = Enumerable.Range(0, Size).ToDictionary(i => i, NonDefaultEqualityComparer<int>.Instance);
    }

    [Benchmark]
    [MemoryRandomization]
    public FrozenDictionary<int, int> Original_Dictionary()
    {
        return _dictionary.ToFrozenDictionary();
    }

    [Benchmark]
    [MemoryRandomization]
    public FrozenDictionary<int, int> PR_Dictionary()
    {
        return _dictionary.ToFrozenDictionaryPR();
    }
}


public sealed class NonDefaultEqualityComparer<TKey> : IEqualityComparer<TKey>
{
    public static NonDefaultEqualityComparer<TKey> Instance { get; } = new();
    public bool Equals(TKey x, TKey y) => EqualityComparer<TKey>.Default.Equals(x, y);
    public int GetHashCode([DisallowNull] TKey obj) => EqualityComparer<TKey>.Default.GetHashCode(obj);
}

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 16, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Oct 16, 2025
}
else if (newDictionary.Count != 0 && !newDictionary.Comparer.Equals(comparer))
{
var dictionary = new Dictionary<TKey, TValue>(newDictionary.Count, comparer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could end up significantly over-allocating, no? Let's say the original use an Ordinal comparer, the new uses an OrdinalIgnoreCase comparer, and the original had every string entry copied many times just with different casing. The resulting dictionary's count is going to be much smaller than the originals, yet this will cause it to allocate much more memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub You're correct. Pre-allocating the dictionary with the original size may result in over-allocating when duplicate keys reduce the required capacity. I have removed the initial capacity specification.

Copy link
Member

@stephentoub stephentoub Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. At that point then, is the only difference this new code path adds is iterating over a Dictionary<> in a strongly-typed way rather than iterating over a dictionary via IEnumerable<KeyValuePair<>>? In this state, have you tried running the before/after benchmarks on .NET 10?

Copy link
Contributor Author

@prozolic prozolic Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the benchmark results accordingly.

BenchmarkDotNet v0.14.1-nightly.20250107.205, Windows 11 (10.0.26100.6584)
11th Gen Intel Core i7-11700 2.50GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 10.0.100-rc.1.25451.107
  [Host]     : .NET 10.0.0 (10.0.25.45207), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-SADIXB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

OutlierMode=DontRemove  PowerPlanMode=00000000-0000-0000-0000-000000000000  Toolchain=CoreRun  
IterationTime=250ms  MaxIterationCount=20  MemoryRandomization=True  
MinIterationCount=15  WarmupCount=1  

| Method              | Size  | Mean          | Error         | StdDev        | Median        | Min           | Max           | Code Size | Gen0     | Gen1     | Gen2     | Allocated |
|-------------------- |------ |--------------:|--------------:|--------------:|--------------:|--------------:|--------------:|----------:|---------:|---------:|---------:|----------:|
| Original_Dictionary | 1     |      50.77 ns |      5.017 ns |      5.778 ns |      48.29 ns |      45.60 ns |      68.21 ns |   2,243 B |   0.0409 |        - |        - |     344 B |
| PR_Dictionary       | 1     |      36.30 ns |      1.882 ns |      2.167 ns |      35.54 ns |      34.84 ns |      42.11 ns |   1,892 B |   0.0354 |        - |        - |     296 B |
| Original_Dictionary | 10    |     270.57 ns |     17.663 ns |     20.341 ns |     264.05 ns |     250.11 ns |     317.83 ns |   3,179 B |   0.1185 |        - |        - |     992 B |
| PR_Dictionary       | 10    |     173.65 ns |      3.315 ns |      3.256 ns |     173.26 ns |     169.91 ns |     182.74 ns |   2,634 B |   0.1124 |        - |        - |     944 B |
| Original_Dictionary | 100   |   2,308.68 ns |    171.190 ns |    197.143 ns |   2,239.27 ns |   2,146.19 ns |   2,915.64 ns |   3,194 B |   0.9931 |   0.0170 |        - |    8328 B |
| PR_Dictionary       | 100   |   1,371.95 ns |     31.095 ns |     35.809 ns |   1,362.03 ns |   1,348.04 ns |   1,513.92 ns |   2,646 B |   0.9881 |   0.0165 |        - |    8280 B |
| Original_Dictionary | 1000  |  19,235.59 ns |    697.651 ns |    803.416 ns |  18,891.83 ns |  18,611.36 ns |  21,727.45 ns |   2,992 B |   9.6957 |   1.8646 |        - |   81304 B |
| PR_Dictionary       | 1000  |  13,123.78 ns |    558.138 ns |    642.753 ns |  12,917.55 ns |  12,450.91 ns |  14,648.06 ns |   2,981 B |   9.6845 |   1.3478 |        - |   81256 B |
| Original_Dictionary | 10000 | 257,255.96 ns | 26,300.558 ns | 30,287.762 ns | 246,032.88 ns | 224,345.99 ns | 362,544.12 ns |   2,993 B | 124.0672 | 124.0672 | 124.0672 |  753284 B |
| PR_Dictionary       | 10000 | 179,099.95 ns |  6,462.982 ns |  7,442.779 ns | 178,413.09 ns | 168,169.41 ns | 198,089.87 ns |   1,573 B | 124.3421 | 124.3421 | 124.3421 |  753236 B |

Copy link
Contributor

@hez2010 hez2010 Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub from the updated benchmark result I see the original one that reserves the initial capacity allocated far less than the new one that doesn't do it. Instead of letting the backing storage to resizing its size to 2x each time (which would also result in over-allocating), pre-allocating the capacity that may be a bit more than the actual need seems definite more feasible to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's going to depend on how much of a difference is expected in count from the original collection that's using a different comparer. The benchmark is using a comparer with identical semantics, so the ideal is to just use the exact same capacity. If, however and for example, the original collection was using Ordinal, the new collection was using OrdinalIgnoreCase, and the original collection contained 1000 words each with 10 different casing variations, choosing that original count is going to result in 10x the allocation required whereas the normal growth strategy will only be 2x.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially suggested pre-allocating, but after considering @stephentoub's concern about the worst-case scenario, I agree that not setting the initial capacity is the better approach.
Since the optimal capacity depends on on the comparer's behavior, the dynamic growth strategy is more robust.

@stephentoub stephentoub added area-System.Collections and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Oct 16, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-System.Collections community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants